Quantification of the effects of chimerism on read mapping, differential expression and annotation following short-read de novo assembly.
Abstract
Background: De novo assembly is often required for analysing short-read RNA sequencing data. An under-characterized aspect of the contigs produced is chimerism, and the extent to which it affects mapping, differential expression analysis and annotation. Despite long-read sequencing negating this issue, short reads remain in use through ongoing research and archived datasets created during the last two decades. Consequently, there is still a need to quantify chimerism and its effects.
Methods: Effects on mapping were quantified by simulating reads off the Drosophila melanogaster cDNA library and mapping these to related reference sets containing increasing levels of chimerism. Next, ten read datasets were simulated and divided into two conditions where, within one, reads representing 1000 randomly selected transcripts were over-represented across replicates. Differential expression analysis was performed iteratively, with reads mapped to each chimeric reference set. Finally, an expectation of r-squared values describing the relationship between alignment and transcript lengths was created for matches involving the cDNA library and reference sets containing incrementing levels of chimerism. Similar values were calculated for contigs produced by three graph-based assemblers, relative to the cDNA library from which the input reads were simulated, or sequenced (relative to the species represented), and compared.
Results: At 5% and 95% chimerism within the reference sets, 100% and 77% of reads still mapped, making mapping success a poor indicator of chimerism. Of the 1000 transcripts selected for over-representation, 953 were identified by differential expression analysis at 5% chimerism; at 10%, 936 were identified, while at 95% it was 510. This indicates that, despite mapping success, per-transcript counts are unpredictably altered. R-squared values obtained for the three assemblers suggest that 5-15% of contigs are chimeric.
Conclusions: Although not evident based on mapping, chimerism had a significant impact on differential expression analysis and megablast identification. This will have consequences for past and present experiments involving short reads.
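The two quantities the Methods rely on, a reference set with a controlled level of chimerism and an r-squared value relating alignment length to transcript length, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the choice to build each chimera by concatenating two whole transcripts, and the definition of "level" as the fraction of transcripts absorbed into chimeras are all assumptions made for illustration.

```python
import random


def make_chimeric_set(transcripts, fraction, seed=0):
    """Return a new reference set in which roughly `fraction` of the input
    transcripts have been fused pairwise into chimeras.

    Assumption: a chimera is the plain concatenation of two whole
    transcripts; the abstract does not say how chimeras were constructed.
    """
    rng = random.Random(seed)
    names = sorted(transcripts)
    rng.shuffle(names)
    n_chimeric = int(round(fraction * len(names)))
    n_chimeric -= n_chimeric % 2  # chimeras are built from pairs
    reference = {}
    for i in range(0, n_chimeric, 2):
        a, b = names[i], names[i + 1]
        reference[f"{a}|{b}"] = transcripts[a] + transcripts[b]
    for name in names[n_chimeric:]:
        reference[name] = transcripts[name]  # left intact
    return reference


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences,
    e.g. alignment lengths (xs) and transcript lengths (ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined; report no linear relationship
    return (sxy * sxy) / (sxx * syy)
```

Under these assumptions, an intact transcript aligns to its source over close to its full length, whereas a chimera aligns over only part of its length, so r-squared would be expected to fall as the chimeric fraction rises.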
Similar resources
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...
The Effects of Planning on Accuracy and Complexity of Iranian EFL Students’ Written Narrative Task Performance
This study compared the different effects of form-focused guided planning vs. meaning-focused guided planning on Iranian pre-intermediate students’ task performance. The study lasted for three weeks and concentrated on eight English structures. Forty-five pre-intermediate Iranian students were randomly assigned to three groups: a guided planning focus-on-form group (GPFG), guided planning focus...
The Effect of Vocabulary Instruction Through Semantic Mapping on Learning and Recall of EFL Learners
No abstract available.
Comparison of Short Read De Novo Alignment Algorithms
The objective of this paper is to survey the algorithms used for de novo alignment of short read data. Since the quality of the sequence bases which are aligned is important, this paper starts by comparing conventional sequencing methods and next-generation sequencing platforms. Next-generation sequencing poses new challenges to the bioinformatics community. A description of several de novo ali...
Journal
Journal title: F1000Research
Year: 2022
ISSN: 2046-1402
DOI: https://doi.org/10.12688/f1000research.108489.1